Interactive optimization of embedding-based text similarity calculations
Authors
Abstract
Comparing text documents is an essential task for a variety of applications within diverse research fields, and several different methods have been developed for this. However, calculating text similarity is an ambiguous and context-dependent task, so many open challenges still exist. In this paper, we present a novel method for text similarity calculations based on the combination of embedding technology and ensemble methods. By using several embeddings, instead of only one, we show that it is possible to achieve higher quality, which in turn is a key factor in developing high-performing applications that exploit text similarity. We also provide a prototype visual analytics tool that helps the analyst find optimally performing ensembles and gain insights into the inner workings of the similarity calculations. Furthermore, we discuss the generalizability of our ideas to fields beyond the scope of text analysis.
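As a rough illustration of the idea of combining several embeddings, the sketch below averages per-embedding cosine similarity matrices. This is not the authors' method: the weighted-average combination rule, the function names `cosine_similarity_matrix` and `ensemble_similarity`, and the toy vectors are all assumptions made for illustration only.

```python
import numpy as np


def cosine_similarity_matrix(vectors):
    """Pairwise cosine similarity between the rows of `vectors`."""
    vectors = np.asarray(vectors, dtype=float)
    norms = np.linalg.norm(vectors, axis=1, keepdims=True)
    # Guard against division by zero for all-zero rows.
    unit = vectors / np.clip(norms, 1e-12, None)
    return unit @ unit.T


def ensemble_similarity(embeddings, weights=None):
    """Weighted average of the cosine similarity matrices of several embeddings.

    `embeddings` is a list of arrays, one per embedding model, each with one
    row per document (hypothetical layout; row i is always the same document).
    """
    matrices = [cosine_similarity_matrix(e) for e in embeddings]
    if weights is None:
        weights = np.ones(len(matrices))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()  # normalize so the weights sum to 1
    return sum(w * m for w, m in zip(weights, matrices))


# Toy data: three documents represented by two made-up embeddings.
emb_a = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
emb_b = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 1.0, 0.0]])

sim = ensemble_similarity([emb_a, emb_b], weights=[0.5, 0.5])
print(sim)
```

In this toy case the two embeddings disagree about which documents are alike, and the ensemble score for each pair falls between the two individual scores. Choosing the weights is the part an analyst would presumably tune interactively.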
Similar resources
Link Prediction using Network Embedding based on Global Similarity
Background: Link prediction is one of the most widely studied problems in complex network analysis. Link prediction requires knowing the background of previous link connections and combining them with available information. Local link prediction approaches with node structure objectives are fast but not accurate enough. On the other hand, the global link predicti...
Scalable Ordinal Embedding to Model Text Similarity
Practitioners of Machine Learning and related fields commonly seek out embeddings of object collections into some Euclidean space. These embeddings are useful for dimensionality reduction, for data visualization, as concrete representations of abstract notions of similarity for similarity search, or as features for some downstream learning task such as web search or sentiment analysis. A wide a...
Simbed: Similarity-Based Embedding
Simbed, standing for similarity-based embedding, is a new method of embedding high-dimensional data. It relies on the preservation of pairwise similarities rather than distances. In this respect, Simbed can be related to other techniques such as stochastic neighbor embedding and its variants. A connection with curvilinear component analysis is also pointed out. Simbed differs from these methods...
Interactive Textbooks; Embedding Image Processing Operator Demonstrations in Text
Traditional image processing teaching has used materials where the theory and drill are separated into textbooks and image processing packages. HTML and JAVA might allow easier construction of an integrated teaching resource. Such a resource would have widespread, platform-independent accessibility. This paper reports our assessment of this potential, which is explored through extensions of the...
Features Based Text Similarity Detection
As the Internet helps us cross cultural borders by providing different information, plagiarism issues are bound to arise. As a result, plagiarism detection becomes more important in overcoming this issue. Different plagiarism detection tools have been developed based on various detection techniques. Nowadays, the fingerprint matching technique plays an important role in those detection tools. However, ...
Journal
Journal title: Information Visualization
Year: 2022
ISSN: 1473-8716, 1473-8724
DOI: https://doi.org/10.1177/14738716221114372